Add target_batch_size to to_dataframe for sharded target eval #389
hmgaudecker wants to merge 1 commit
The additional_targets DAG in to_dataframe materializes the full in-regime panel on one device. target_batch_size chunks that evaluation and pulls each chunk to host before starting the next, bounding the fused-DAG device workspace independently of the simulate call's subject_batch_size. The target eval can therefore be chunked even when subject_batch_size must stay 0 (a distributed/sharded grid). target_batch_size defaults to the simulate call's subject_batch_size; the computed values are identical to the single-pass evaluation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
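The default and the chunk boundaries described in the commit message can be sketched roughly as follows. This is an illustration, not pylcm's code: `resolve_target_batch_size` and `chunk_bounds` are hypothetical helper names, and treating a batch size of 0 as "evaluate everything in one pass" is an assumption inferred from the remark that `subject_batch_size` stays 0 on sharded grids.

```python
from __future__ import annotations

from typing import Iterator, Optional


def resolve_target_batch_size(
    target_batch_size: Optional[int], subject_batch_size: int
) -> int:
    """Hypothetical helper: None falls back to the simulate call's subject_batch_size."""
    return subject_batch_size if target_batch_size is None else target_batch_size


def chunk_bounds(n_subjects: int, batch_size: int) -> Iterator[tuple[int, int]]:
    """Yield half-open (start, stop) subject ranges.

    Assumption: batch_size <= 0 means "no chunking", i.e. a single pass.
    """
    if batch_size <= 0 or batch_size >= n_subjects:
        yield (0, n_subjects)
        return
    for start in range(0, n_subjects, batch_size):
        # The last chunk may be shorter when batch_size does not divide n_subjects.
        yield (start, min(start + batch_size, n_subjects))


# Sharded case: subject_batch_size is 0 and no target_batch_size is given,
# so the target eval stays single-pass (the unchanged default behavior).
default_size = resolve_target_batch_size(None, subject_batch_size=0)
print(list(chunk_bounds(7, default_size)))

# Explicit target_batch_size: the target eval is chunked on its own.
explicit_size = resolve_target_batch_size(3, subject_batch_size=0)
print(list(chunk_bounds(7, explicit_size)))
```

Keeping the fallback in one small function means the chunking decision for the target eval never feeds back into the solve or simulate settings.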
Closing as a false alarm. The device-0 OOM this … Fixed at the source by making the constant a host array (OpenSourceEconomics/aca-model#13); the aca-slurm …
Adds a target_batch_size parameter to SimulationResult.to_dataframe, chunking the additional_targets evaluation (and evicting each chunk to host) independently of the simulate call's subject_batch_size.

Why

subject_batch_size > 0 cannot be combined with distributed (sharded) grids: the value-function array is sharded across devices and can't be gathered onto one, so pylcm rejects the combination. But under a shard the post-simulate additional_targets eval still pulls the (host-resident, per-shard) panel back onto the mesh's device 0 and evaluates the target DAG over the whole population in one pass, which can exhaust that device's memory. This knob bounds that eval's device residency without touching the sharded solve. target_batch_size=None (default) falls back to the simulate call's subject_batch_size, so current behavior is unchanged.

Test

tests/simulation/test_subject_batching.py::test_to_dataframe_targets_are_invariant_to_target_batch_size simulates single-pass, then chunks only the target eval at target_batch_size ∈ {2, 3, 100} (even split, uneven, chunk-larger-than-population), asserting the computed-target column is identical to the single-pass result.

🤖 Generated with Claude Code
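The invariance property the test checks can be illustrated with a toy stand-in. This is a sketch under stated assumptions, not the actual test: `evaluate_targets` and `toy_target` are hypothetical names, plain Python lists replace device arrays, and the population size of 4 is chosen only so that 2 splits evenly, 3 splits unevenly, and 100 exceeds the population. The invariance holds here because the toy target is computed per subject, with no dependence on other subjects in the chunk.

```python
from __future__ import annotations

from typing import Callable, Sequence


def evaluate_targets(
    panel: Sequence[float],
    batch_size: int,
    target: Callable[[Sequence[float]], list[float]],
) -> list[float]:
    """Evaluate `target` chunk by chunk and concatenate the results.

    Assumption: batch_size <= 0 means a single pass over the whole panel.
    """
    n = len(panel)
    step = n if batch_size <= 0 else batch_size
    results: list[float] = []
    for start in range(0, n, max(step, 1)):
        chunk = panel[start:start + step]
        # Stand-in for "evaluate on device, then pull the chunk to host".
        results.extend(target(chunk))
    return results


def toy_target(chunk: Sequence[float]) -> list[float]:
    """Hypothetical per-subject target: each output depends on one subject only."""
    return [2.0 * x + 1.0 for x in chunk]


panel = [0.0, 1.0, 2.0, 3.0]
single_pass = evaluate_targets(panel, 0, toy_target)

# Even split, uneven split, and a chunk larger than the population.
for size in (2, 3, 100):
    assert evaluate_targets(panel, size, toy_target) == single_pass
```

A target that aggregated across subjects (for example, a population mean) would not be invariant under this kind of chunking, so the sketch only covers per-subject targets.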